Unsupervised learning of object landmarks by factorized spatial embeddings
Automatically learning the structure of object categories remains an
important open problem in computer vision. In this paper, we propose a novel
unsupervised approach that can discover and learn landmarks in object
categories, thus characterizing their structure. Our approach is based on
factorizing image deformations, as induced by a viewpoint change or an object
deformation, by learning a deep neural network that detects landmarks
consistently with such visual effects. Furthermore, we show that the learned
landmarks establish meaningful correspondences between different object
instances in a category without having to impose this requirement explicitly.
We assess the method qualitatively on a variety of object types, natural and
man-made. We also show that our unsupervised landmarks are highly predictive of
manually-annotated landmarks in face benchmark datasets, and can be used to
regress these with a high degree of accuracy.
Comment: To be published in ICCV 2017.
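As a rough illustration of the factorization idea, consider a detector that must commute with a known image deformation: the landmarks found in a warped image should coincide with the warped landmarks of the original. Below is a minimal PyTorch sketch of such an equivariance loss; the framework choice and the names net, warp, and soft_argmax are assumptions of this sketch, not the paper's implementation.

    import torch
    import torch.nn.functional as F

    def soft_argmax(heatmaps):
        # heatmaps: (B, K, H, W) -> landmark coordinates (B, K, 2) in [-1, 1]
        b, k, h, w = heatmaps.shape
        probs = F.softmax(heatmaps.view(b, k, -1), dim=-1).view(b, k, h, w)
        ys = torch.linspace(-1, 1, h, device=heatmaps.device)
        xs = torch.linspace(-1, 1, w, device=heatmaps.device)
        x = (probs.sum(dim=2) * xs).sum(dim=-1)   # expected column per landmark
        y = (probs.sum(dim=3) * ys).sum(dim=-1)   # expected row per landmark
        return torch.stack([x, y], dim=-1)

    def equivariance_loss(net, image, warp):
        # warp: a known random deformation g that can act on images and points.
        # The detector should commute with g: landmarks(g(x)) == g(landmarks(x)).
        p = soft_argmax(net(image))                       # landmarks of original
        p_warped = soft_argmax(net(warp.apply_to_image(image)))
        return F.mse_loss(p_warped, warp.apply_to_points(p))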
Weakly Supervised Deep Detection Networks
Weakly supervised learning of object detection is an important problem in image understanding that still does not have a satisfactory solution. In this paper, we address this problem by exploiting the power of deep convolutional neural networks pre-trained on large-scale image-level classification tasks. We propose a weakly supervised deep detection architecture that modifies one such network to operate at the level of image regions, simultaneously performing region selection and classification. Trained as an image classifier, the architecture implicitly learns object detectors that are better than alternative weakly supervised detection systems on the PASCAL VOC data. The model, a simple and elegant end-to-end architecture, also outperforms standard data augmentation and fine-tuning techniques for the task of image-level classification.
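The two-stream region scoring described above can be sketched as follows; this is a minimal PyTorch reading of the idea (layer names and shapes are illustrative). One softmax ranks classes within a region, the other ranks regions within a class, and summing their product over regions yields image-level scores trainable from image labels alone.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class WSDDNHead(nn.Module):
        def __init__(self, feat_dim, num_classes):
            super().__init__()
            self.fc_cls = nn.Linear(feat_dim, num_classes)  # classification stream
            self.fc_det = nn.Linear(feat_dim, num_classes)  # detection stream

        def forward(self, feats):
            # feats: (R, D) features for R region proposals.
            # Softmax over classes: "what is in this region?"
            cls_scores = F.softmax(self.fc_cls(feats), dim=1)
            # Softmax over regions: "which regions matter for this class?"
            det_scores = F.softmax(self.fc_det(feats), dim=0)
            region_scores = cls_scores * det_scores         # (R, C)
            # Summing over regions gives image-level class scores, so the
            # model can be trained with image labels only.
            return region_scores.sum(dim=0), region_scores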
Dataset Condensation with Differentiable Siamese Augmentation
In many machine learning problems, large-scale datasets have become the de-facto standard for training state-of-the-art deep networks, at the price of a heavy computation load. In this paper, we focus on condensing large training sets into significantly smaller synthetic sets which can be used to train deep neural networks from scratch with a minimal drop in performance. Inspired by recent training-set synthesis methods, we propose Differentiable Siamese Augmentation, which enables the effective use of data augmentation to synthesize more informative synthetic images and thus achieves better performance when training networks with augmentations. Experiments on multiple image classification benchmarks demonstrate that the proposed method obtains substantial gains over the state-of-the-art, including 7% improvements on the CIFAR10 and CIFAR100 datasets. We show that, using less than 1% of the data, our method achieves 99.6%, 94.9%, 88.5%, and 71.5% relative performance on MNIST, FashionMNIST, SVHN, and CIFAR10 respectively. We also explore the use of our method in continual learning and neural architecture search, and show promising results.
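A minimal sketch of the Siamese augmentation idea under a gradient-matching objective, assuming hypothetical helpers dsa_augment (a differentiable augmentation parameterized by a shared seed) and distance (a layer-wise gradient distance); the point is that the same sampled transform is applied to both the real and the synthetic batch.

    import torch

    def match_gradients_with_dsa(model, loss_fn, real_x, real_y, syn_x, syn_y,
                                 dsa_augment, distance):
        # Sample one augmentation and apply it identically to both batches,
        # so real and synthetic gradients are compared under the same transform.
        seed = torch.randint(0, 2**31, (1,)).item()
        g_real = torch.autograd.grad(
            loss_fn(model(dsa_augment(real_x, seed)), real_y),
            model.parameters())
        g_syn = torch.autograd.grad(
            loss_fn(model(dsa_augment(syn_x, seed)), syn_y),
            model.parameters(), create_graph=True)  # keep graph to update syn_x
        return sum(distance(a, b) for a, b in zip(g_real, g_syn))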
Dataset Condensation with Distribution Matching
The computational cost of training state-of-the-art deep models in many learning problems is rapidly increasing due to more sophisticated models and larger datasets. A recent promising direction for reducing training cost is dataset condensation, which aims to replace the original large training set with a significantly smaller learned synthetic set while preserving the original information. While training deep models on the small set of condensed images can be extremely fast, their synthesis remains computationally expensive due to the complex bi-level optimization and second-order derivative computation. In this work, we propose a simple yet effective method that synthesizes condensed images by matching the feature distributions of the synthetic and original training images in many sampled embedding spaces. Our method significantly reduces the synthesis cost while achieving comparable or better performance. Thanks to its efficiency, we apply our method to more realistic and larger datasets with sophisticated neural architectures and obtain a significant performance boost. We also show promising practical benefits of our method in continual learning and neural architecture search.
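A minimal sketch of the distribution-matching objective, assuming a hypothetical make_random_embedder that returns a freshly initialized, untrained network: the mean embeddings of the real and synthetic batches are matched directly, which avoids the bi-level optimization and second-order derivatives mentioned above.

    import torch

    def distribution_matching_loss(real_x, syn_x, make_random_embedder, n_spaces=5):
        # Pull the mean embedding of the synthetic set toward that of the real
        # set in several randomly sampled embedding spaces; no inner-loop model
        # training is required.
        loss = 0.0
        for _ in range(n_spaces):
            embed = make_random_embedder()     # freshly initialized, untrained net
            with torch.no_grad():
                mu_real = embed(real_x).mean(dim=0)
            mu_syn = embed(syn_x).mean(dim=0)  # gradients flow only into syn_x
            loss = loss + ((mu_real - mu_syn) ** 2).sum()
        return loss / n_spaces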
Novel estimation and control techniques in micromanipulation using vision and force feedback
With the recent advances in the fields of micro- and nanotechnology, there has been growing interest in complex micromanipulation and microassembly strategies. Although many commercially available micro-devices, such as the key components in automobile airbags, ink-jet printers, and projection display systems, are currently produced in a batch technique with little assembly, many other products, such as read/write heads for hard disks and fiber-optic assemblies, require flexible precision assembly. Furthermore, many biological micromanipulations, such as in-vitro fertilization, cell characterization, and treatment, rely on the ability of human operators. The requirement for high-precision, repeatable, and financially viable operation in these tasks has motivated the elimination of direct human involvement and the pursuit of autonomy in micromanipulation and microassembly. In this thesis, a fully automated, dexterous micromanipulation strategy based on vision and force feedback is developed. More specifically, a robust vision-based control architecture is proposed and implemented to compensate for errors due to uncertainties in the position, behavior, and shape of the micro-objects to be manipulated. Moreover, novel estimators are designed to identify the system and to characterize the mechanical properties of biological structures through a synthesis of concepts from computer vision, estimation, and control theory. The estimated mechanical parameters are used to reconstruct the force imposed on a biomembrane and to provide adequate information to control the position, velocity, and acceleration of the probe without damaging the cell/tissue during an injection task.
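For context, vision-based control architectures of this kind typically build on image-based visual servoing. The proportional law below is a generic textbook sketch, not the thesis's specific controller, and it assumes the interaction (image Jacobian) matrix L is available.

    import numpy as np

    def ibvs_step(s, s_star, L, gain=0.5):
        # s: current image features, s_star: desired features,
        # L: interaction matrix relating camera/probe velocity to feature
        # velocities. Classic control law: v = -gain * pinv(L) @ (s - s*).
        error = s - s_star
        v = -gain * np.linalg.pinv(L) @ error   # commanded probe velocity
        return v, error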
Mode Normalization
Normalization methods are a central building block in the deep learning toolbox. They accelerate and stabilize training while decreasing the dependence on manually tuned learning rate schedules. When learning from multi-modal distributions, the effectiveness of batch normalization (BN), arguably the most prominent normalization method, is reduced. As a remedy, we propose a more flexible approach: by extending the normalization to more than a single mean and variance, we detect modes of the data on the fly, jointly normalizing samples that share common features. We demonstrate that our method outperforms BN and other widely used normalization techniques in several experiments, including single- and multi-task datasets.
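A minimal sketch of normalizing with multiple modes, assuming a small learned gate that softly assigns each sample to K modes; shapes and names are illustrative rather than the paper's exact formulation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ModeNorm2d(nn.Module):
        def __init__(self, channels, k_modes=2, eps=1e-5):
            super().__init__()
            self.k = k_modes
            self.eps = eps
            self.gate = nn.Linear(channels, k_modes)      # soft mode assignment
            self.gamma = nn.Parameter(torch.ones(channels))
            self.beta = nn.Parameter(torch.zeros(channels))

        def forward(self, x):                             # x: (B, C, H, W)
            pooled = x.mean(dim=(2, 3))                   # per-sample summary (B, C)
            g = F.softmax(self.gate(pooled), dim=1)       # responsibilities (B, K)
            out = 0.0
            for k in range(self.k):
                w = g[:, k].view(-1, 1, 1, 1)
                n = w.sum() * x.shape[2] * x.shape[3] + self.eps
                mu = (w * x).sum(dim=(0, 2, 3), keepdim=True) / n   # weighted mean
                var = (w * (x - mu) ** 2).sum(dim=(0, 2, 3), keepdim=True) / n
                out = out + w * (x - mu) / torch.sqrt(var + self.eps)
            return self.gamma.view(1, -1, 1, 1) * out + self.beta.view(1, -1, 1, 1)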